AITopics | Bowman County

Bowman County

PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation

Habba, Eliya, Dahan, Noam, Lior, Gili, Stanovsky, Gabriel

arXiv.org Artificial Intelligence Sep-23-2025

Evaluating LLMs with a single prompt has proven unreliable, with small changes leading to significant performance differences. However, generating the prompt variations needed for a more robust multi-prompt evaluation is challenging, limiting its adoption in practice. To address this, we introduce PromptSuite, a framework that enables the automatic generation of various prompts. PromptSuite is flexible - working out of the box on a wide range of tasks and benchmarks. It follows a modular prompt design, allowing controlled perturbations to each component, and is extensible, supporting the addition of new components and perturbation types. Through a series of case studies, we show that PromptSuite provides meaningful variations to support strong evaluation practices. All resources, including the Python API, source code, user-friendly web interface, and demonstration video, are available at: https://eliyahabba.github.io/PromptSuite/.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.14913

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Position: We Need Responsible, Application-Driven (RAD) AI Research

Hartman, Sarah, Ong, Cheng Soon, Powles, Julia, Kuhnert, Petra

arXiv.org Artificial Intelligence Aug-20-2025

This position paper argues that achieving meaningful scientific and societal advances with artificial intelligence (AI) requires a responsible, application-driven approach (RAD) to AI research. As AI is increasingly integrated into society, AI researchers must engage with the specific contexts where AI is being applied. This includes being responsive to ethical and legal considerations, technical and societal constraints, and public discourse. We present the case for RAD-AI to drive research through a three-staged approach: (1) building transdisciplinary teams and people-centred studies; (2) addressing context-specific methods, ethical commitments, assumptions, and metrics; and (3) testing and sustaining efficacy through staged testbeds and a community of practice. We present a vision for the future of application-driven AI research to unlock new value through technically feasible methods that are adaptive to the contextual needs and values of the communities they ultimately serve.

artificial intelligence, machine learning, rad-ai, (15 more...)

arXiv.org Artificial Intelligence

2505.04104

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > Western Australia (0.04)
Oceania > Australia > Northern Territory > Darwin (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government (1.00)
Food & Agriculture > Agriculture (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Applied AI (0.93)

Auto Review: Second Stage Error Detection for Highly Accurate Information Extraction from Phone Conversations

Qamar, Ayesha, Raghuvanshi, Arushi, Sathi, Conal, Son, Youngseo

arXiv.org Artificial Intelligence Jun-9-2025

Automating benefit verification phone calls saves time in healthcare and helps patients receive treatment faster. It is critical to obtain highly accurate information in these phone calls, as it can affect a patient's healthcare journey. Given the noise in phone call transcripts, we have a two-stage system that involves a post-call review phase for potentially noisy fields, where human reviewers manually verify the extracted data, a labor-intensive task. To automate this stage, we introduce Auto Review, which significantly reduces manual effort while maintaining a high bar for accuracy. This system, being highly reliant on call transcripts, suffers a performance bottleneck due to automatic speech recognition (ASR) issues. This problem is further exacerbated by the use of domain-specific jargon in the calls. In this work, we propose a second-stage postprocessing pipeline for accurate information extraction. We improve accuracy by using multiple ASR alternatives and a pseudo-labeling approach that does not require manually corrected transcripts. Experiments with general-purpose large language models and feature-based model pipelines demonstrate substantial improvements in the quality of corrected call transcripts, thereby enhancing the efficiency of Auto Review.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2506.054

Country:

North America > United States > North Dakota > Bowman County (0.15)
Asia > Thailand > Bangkok > Bangkok (0.05)
Europe > United Kingdom > North Sea > Southern North Sea (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation

Kim, Sungnyun, Cho, Sungwoo, Bae, Sangmin, Jang, Kangwook, Yun, Se-Young

arXiv.org Artificial Intelligence May-1-2025

Audio-visual speech recognition (AVSR) incorporates auditory and visual modalities to improve recognition accuracy, particularly in noisy environments where audio-only speech systems are insufficient. While previous research has largely addressed audio disruptions, few studies have dealt with visual corruptions, e.g., lip occlusions or blurred videos, which are also detrimental. To address this real-world challenge, we propose CAV2vec, a novel self-supervised speech representation learning framework particularly designed to handle audio-visual joint corruption. CAV2vec employs a self-distillation approach with a corrupted prediction task, where the student model learns to predict clean targets, generated by the teacher model, with corrupted input frames. Specifically, we suggest a unimodal multi-task learning, which distills cross-modal knowledge and aligns the corrupted modalities, by predicting clean audio targets with corrupted videos, and clean video targets with corrupted audios. This strategy mitigates the dispersion in the representation space caused by corrupted modalities, leading to more reliable and robust audio-visual fusion. Our experiments on robust AVSR benchmarks demonstrate that the corrupted representation learning method significantly enhances recognition accuracy across generalized environments involving various types of corruption. Our code is available at https://github.com/sungnyun/cav2vec.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2504.18539

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Tom Green County (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

AAAI News

Hamilton, Carol

AI Magazine Mar-15-2002

The AAAI Press - Distributed by The MIT Press, Massachusetts Institute of Technology, 5 Cambridge Center, Cambridge, Massachusetts 02142. To order, call toll free: (800) 356-0343 or (617) 625-8569. ... first time that AAAI's National conference has been held in Canada ... In addition, the program chairs are experimenting with a new format for AAAI.

artificial intelligence, machine learning, university, (15 more...)

AI Magazine

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.34)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
North America > Canada > Ontario > Toronto (0.14)
(17 more...)

Genre: Personal > Honors (0.46)

Industry:

Education (1.00)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
